Ensure `swap_nonoverlapping` is really always untyped #137412
r? @ibraheemdev rustbot has assigned @ibraheemdev. Use |
The Miri subtree was changed. cc @rust-lang/miri
src/tools/miri/tests/pass/stdlib/swap_nonoverlapping_is_untyped.rs
Passing this one along because I'm not the best person to review this. r? libs
I'm mostly sticking to the trivial PRs at the moment as my review capacity is limited. Re-rolling again. r? libs
The code changes LGTM for the non-`const` path; see the comment for the `const` path.
I did not look at the tests; I think that needs an LLVM/codegen expert. @nikic maybe?
To be clear I only reviewed for correctness, I cannot judge whether all the tricks to get better performance actually work.
// Ensure we do better than a long run of byte copies,
// see <https://github.com/rust-lang/rust/issues/134946>

// CHECK-NOT: movb
// CHECK-COUNT-8: movups{{.+}}xmm
// CHECK-NOT: movb
// CHECK-COUNT-4: movq
// CHECK-NOT: movb
// CHECK-COUNT-4: movl
// CHECK-NOT: movb
// CHECK: retq
note for reviewers: the codegen tests here are more about demonstrating what actually happens on a variety of types, and the exact details don't matter that much.
Reviewing the rust code is enough to know that LLVM will swap it, but for example here what we're trying to see is that it's not just a huge row of `movb`s like you can see in https://rust.godbolt.org/z/MKfxn1Tjr
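For readers who want to look at the generated assembly themselves, here is a minimal probe function that could be pasted into Compiler Explorer. The type `Odd` and the function `swap_odd` are hypothetical names chosen for illustration; the 45-byte size is an assumption picked only because it is not a multiple of 16, and it is not claimed to match the type used in the PR's codegen test.

```rust
use std::ptr;

// Hypothetical buffer type (an assumption for illustration): its size is not
// a multiple of 16, so a chunked swap has to deal with a leftover tail.
type Odd = [u8; 45];

// `inline(never)` keeps the swap as a separate function so its assembly is
// easy to find in the compiler output.
#[inline(never)]
pub fn swap_odd(a: &mut Odd, b: &mut Odd) {
    // SAFETY: two distinct `&mut` references cannot overlap, and each is
    // valid for reads and writes of exactly one `Odd`.
    unsafe { ptr::swap_nonoverlapping(a as *mut Odd, b as *mut Odd, 1) }
}

fn main() {
    let mut a: Odd = [0xAA; 45];
    let mut b: Odd = [0x55; 45];
    swap_odd(&mut a, &mut b);
    // The contents must have traded places, whatever instructions were used.
    assert!(a.iter().all(|&x| x == 0x55));
    assert!(b.iter().all(|&x| x == 0xAA));
}
```

Which instructions actually appear depends on the toolchain version, target, and optimization level, so the assembly should be inspected rather than assumed.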
c1b9092 to 9d37b4b
☔ The latest upstream changes (presumably #137848) made this pull request unmergeable. Please resolve the merge conflicts.
☔ The latest upstream changes (presumably #138155) made this pull request unmergeable. Please resolve the merge conflicts.
@bors try
Ensure `swap_nonoverlapping` is really always untyped

This replaces rust-lang#134954, which was arguably overcomplicated.

## Fixes rust-lang#134713

Actually using the type passed to `ptr::swap_nonoverlapping` for anything other than its size + align turns out to not work, so this goes back to always erasing the types down to just bytes.

(Except in `const`, which keeps doing the same thing as before to preserve `@RalfJung's` fix from rust-lang#134689)

## Fixes rust-lang#134946

I'd previously moved the swapping to use auto-vectorization *on bytes*, but someone pointed out on Discord that the tail loop handling from that left a whole bunch of byte-by-byte swapping around. This goes back to manual tail handling to avoid that, then still triggers auto-vectorization on pointer-width values. (So you'll see `<4 x i64>` on `x86-64-v3` for example.)

---

try-jobs: x86_64-gnu-distcheck
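To make "untyped" concrete, here is a minimal sketch of the kind of program that relies on it. The names `Padded` and `swap_as_padded` are hypothetical and introduced only for this illustration; the sketch assumes a toolchain that includes this change, and it is not presented as the exact test added by the PR. The idea is that `Padded` has trailing padding, so a swap that respected the type could legitimately skip those bytes, whereas a swap that only uses the size and alignment must move every byte.

```rust
use std::mem::{align_of, size_of};
use std::ptr;

// Hypothetical type with trailing padding after the `u8` field.
// `repr(C)` makes the layout predictable: a `usize`, one byte, then padding.
#[allow(dead_code)]
#[repr(C)]
struct Padded(usize, u8);

/// Swaps two `[usize; 2]` buffers while telling `swap_nonoverlapping`
/// that they are `Padded` values.
fn swap_as_padded(a: &mut [usize; 2], b: &mut [usize; 2]) {
    // Only size and alignment should matter, so check that they agree.
    assert_eq!(size_of::<Padded>(), size_of::<[usize; 2]>());
    assert_eq!(align_of::<Padded>(), align_of::<[usize; 2]>());
    // SAFETY: `a` and `b` are distinct exclusive borrows, so they do not
    // overlap, and each is valid and aligned for one `Padded`-sized access.
    unsafe {
        ptr::swap_nonoverlapping(
            a.as_mut_ptr().cast::<Padded>(),
            b.as_mut_ptr().cast::<Padded>(),
            1,
        );
    }
}

fn main() {
    let mut a = [1000usize, 2000];
    let mut b = [3000usize, 4000];
    swap_as_padded(&mut a, &mut b);
    // With an untyped swap, the bytes sitting in `Padded`'s padding
    // positions are carried over too, so both words arrive intact.
    assert_eq!(a, [3000, 4000]);
    assert_eq!(b, [1000, 2000]);
}
```

Running a program like this under Miri is one way to check that no bytes are lost or treated as uninitialized along the way.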
In `swap_nonoverlapping_short` there's a new `debug_assert!`, and if that's enabled then the `alloca`s don't optimize out.
Only change is skipping the test when the new |
Rollup of 18 pull requests

Successful merges:
- rust-lang#137412 (Ensure `swap_nonoverlapping` is really always untyped)
- rust-lang#138167 (Small code improvement in rustdoc hidden stripper)
- rust-lang#138605 (Clean up librustdoc::html::render to be better encapsulated)
- rust-lang#138682 (Allow drivers to supply a list of extra symbols to intern)
- rust-lang#138904 (Test linking and running `no_std` binaries)
- rust-lang#139423 (Suppress missing field error when autoderef bottoms out in infer)
- rust-lang#139449 (match ergonomics: replace `peel_off_references` with a recursive call)
- rust-lang#139507 (compiletest: Trim whitespace from environment variable names)
- rust-lang#139530 (Remove some dead or leftover code related to rustc-intrinsic abi removal)
- rust-lang#139560 (fix title of offset_of_enum feature)
- rust-lang#139563 (emit a better error message for using the macro incorrectly)
- rust-lang#139568 (Don't use empty trait names)
- rust-lang#139580 (Temporarily leave the review rotation)
- rust-lang#139589 (saethlin is back from vacation)
- rust-lang#139592 (rustdoc: Enable Markdown extensions when looking for doctests)
- rust-lang#139599 (Tracking issue template: fine-grained information on style update status)
- rust-lang#139600 (Update `compiler-builtins` to 0.1.153)
- rust-lang#139606 (Update compiletest to Edition 2024)

r? `@ghost`
`@rustbot` modify labels: rollup
Rollup of 17 pull requests

Successful merges:
- rust-lang#137412 (Ensure `swap_nonoverlapping` is really always untyped)
- rust-lang#138167 (Small code improvement in rustdoc hidden stripper)
- rust-lang#138605 (Clean up librustdoc::html::render to be better encapsulated)
- rust-lang#138682 (Allow drivers to supply a list of extra symbols to intern)
- rust-lang#138904 (Test linking and running `no_std` binaries)
- rust-lang#139423 (Suppress missing field error when autoderef bottoms out in infer)
- rust-lang#139449 (match ergonomics: replace `peel_off_references` with a recursive call)
- rust-lang#139507 (compiletest: Trim whitespace from environment variable names)
- rust-lang#139530 (Remove some dead or leftover code related to rustc-intrinsic abi removal)
- rust-lang#139560 (fix title of offset_of_enum feature)
- rust-lang#139563 (emit a better error message for using the macro incorrectly)
- rust-lang#139568 (Don't use empty trait names)
- rust-lang#139580 (Temporarily leave the review rotation)
- rust-lang#139589 (saethlin is back from vacation)
- rust-lang#139592 (rustdoc: Enable Markdown extensions when looking for doctests)
- rust-lang#139599 (Tracking issue template: fine-grained information on style update status)
- rust-lang#139600 (Update `compiler-builtins` to 0.1.153)

r? `@ghost`
`@rustbot` modify labels: rollup
☀️ Test successful - checks-actions
What is this?
This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.
Comparing 2205455 (parent) -> 0fe8f34 (this PR)
Test differences: 161 test diffs (Stage 0, Stage 1, Stage 2)
Additionally, 156 doctest diffs were found. These are ignored, as they are noisy.
Job group index
Job duration changes
How to interpret the job duration changes?
Job durations can vary a lot, based on the actual runner instance
Finished benchmarking commit (0fe8f34): comparison URL.
Overall result: ❌✅ regressions and improvements - please read the text below
Our benchmarks found a performance regression caused by this PR.
Next Steps:
@rustbot label: +perf-regression
Instruction count
This is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.
Max RSS (memory usage)
Results (primary 1.2%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Cycles
Results (primary 0.7%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary size
Results (primary 0.0%, secondary -0.1%)
This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 783.243s -> 781.425s (-0.23%)
Correctness fix, relatively small set of regressed scenarios, compared to number of improvements. Marking as triaged.
This replaces #134954, which was arguably overcomplicated.

Fixes #134713

Actually using the type passed to `ptr::swap_nonoverlapping` for anything other than its size + align turns out to not work, so this goes back to always erasing the types down to just bytes.

(Except in `const`, which keeps doing the same thing as before to preserve @RalfJung's fix from #134689)

Fixes #134946

I'd previously moved the swapping to use auto-vectorization on bytes, but someone pointed out on Discord that the tail loop handling from that left a whole bunch of byte-by-byte swapping around. This goes back to manual tail handling to avoid that, then still triggers auto-vectorization on pointer-width values. (So you'll see `<4 x i64>` on `x86-64-v3` for example.)
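The general shape of "pointer-width chunks plus manual tail handling" can be sketched in safe Rust as follows. This is an illustration of the idea only, not the implementation in `core`: the names `swap_chunk`, `swap_bytes_chunked`, and `WORD` are hypothetical, the sketch works on byte slices rather than raw pointers, and it makes no claim about what the optimizer will vectorize.

```rust
use std::mem::size_of;

// Pointer width in bytes on the current target.
const WORD: usize = size_of::<usize>();

/// Swaps the first `N` bytes of `a` and `b` through a fixed-size temporary.
/// A fixed `N` lets the compiler treat each swap as one load/store pair.
fn swap_chunk<const N: usize>(a: &mut [u8], b: &mut [u8]) {
    let mut tmp = [0u8; N];
    tmp.copy_from_slice(&a[..N]);
    a[..N].copy_from_slice(&b[..N]);
    b[..N].copy_from_slice(&tmp);
}

/// Swaps two equal-length byte slices: pointer-width chunks first, then a
/// tail handled with at most one 4-byte, one 2-byte, and one 1-byte swap,
/// instead of a byte-by-byte loop.
pub fn swap_bytes_chunked(a: &mut [u8], b: &mut [u8]) {
    assert_eq!(a.len(), b.len(), "slices must have equal length");
    let len = a.len();
    let mut offset = 0;

    // Main loop over pointer-width chunks.
    while len - offset >= WORD {
        swap_chunk::<WORD>(&mut a[offset..], &mut b[offset..]);
        offset += WORD;
    }

    // Manual tail: fewer than WORD bytes remain, so each branch below runs
    // at most once.
    if len - offset >= 4 {
        swap_chunk::<4>(&mut a[offset..], &mut b[offset..]);
        offset += 4;
    }
    if len - offset >= 2 {
        swap_chunk::<2>(&mut a[offset..], &mut b[offset..]);
        offset += 2;
    }
    if len - offset >= 1 {
        swap_chunk::<1>(&mut a[offset..], &mut b[offset..]);
    }
}

fn main() {
    // Exercise every tail length around a few word boundaries.
    for len in 0..40u8 {
        let mut a: Vec<u8> = (0..len).collect();
        let mut b: Vec<u8> = (0..len).map(|x| x + 100).collect();
        let (a0, b0) = (a.clone(), b.clone());
        swap_bytes_chunked(&mut a, &mut b);
        assert_eq!(a, b0);
        assert_eq!(b, a0);
    }
}
```

The design point is that the tail costs a bounded number of progressively smaller swaps rather than a loop whose trip count depends on the remainder.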